Clustering Methods For Spatial Datamining
Author
Abstract
We investigate the use of biased sampling according to the density of the dataset, to speed up the operation of general data mining tasks, such as clustering and outlier detection in large multidimensional datasets. In density-biased sampling, the probability that a given point will be included in the sample depends on the local density of the dataset. We propose a general technique for density-biased sampling that can factor in user requirements to sample for properties of interest, and can be tuned for specific data mining tasks. This allows great flexibility, and improved accuracy of the results over simple random sampling. We describe our approach in detail, we analytically evaluate it, and show how it can be optimized for approximate clustering and outlier detection. Finally we present a thorough experimental evaluation of the proposed method, applying density-biased sampling on real and synthetic data sets, and employing clustering and outlier detection algorithms, thus highlighting the utility of our approach.
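The abstract states only that a point's inclusion probability depends on the local density of the dataset. As a minimal sketch of that idea, and not the authors' actual method, the Python below assumes a grid-based density estimate (points per cell) and an inclusion probability proportional to the density raised to a tunable exponent. The function name `density_biased_sample` and the parameters `sample_size`, `exponent`, and `cell_width` are hypothetical names introduced for illustration.

```python
import random
from collections import Counter


def density_biased_sample(points, sample_size, exponent, cell_width, rng=None):
    """Keep each point with probability proportional to density ** exponent.

    Assumptions (not from the source): local density is estimated as the
    number of points sharing a grid cell of side `cell_width`, and the
    probabilities are scaled so the expected sample size is about
    `sample_size` (less when probabilities are capped at 1).
    """
    rng = rng or random.Random()
    # Map every point to the grid cell that contains it.
    cells = [tuple(int(c // cell_width) for c in p) for p in points]
    counts = Counter(cells)
    # Weight of a point: (number of points in its cell) ** exponent.
    weights = [counts[cell] ** exponent for cell in cells]
    total = sum(weights)
    sample = []
    for p, w in zip(points, weights):
        # Scale the weight into a probability and cap it at 1.
        prob = min(1.0, sample_size * w / total)
        if rng.random() < prob:
            sample.append(p)
    return sample


if __name__ == "__main__":
    data = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (5.0, 5.0)]
    # A negative exponent favours sparse regions; the isolated point
    # (5.0, 5.0) gets the largest inclusion probability here.
    print(density_biased_sample(data, 2, -1.0, 1.0, random.Random(0)))
```

Under these assumptions, an exponent of 0 reduces to uniform random sampling, a positive exponent oversamples dense regions (which may suit approximate clustering), and a negative exponent oversamples sparse regions (which may suit outlier detection).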
Similar resources
Investigation of effective factors in expanding electronic payment in Iran using datamining techniques
E-banking has grown dramatically with the development of ICT industry and banks offer their services to customers from different channels. Nowadays, considering the great economic benefits of electronic banking systems, the need to pay attention to the expansion of electronic banking is increasingly felt in terms of reducing costs and increasing the bank's profitability. The purpose of this stu...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
Privacy of Data, Preserving in Data Mining
A huge volume of detailed personal data is regularly collected, and sharing these data has proved beneficial for data mining applications. Such data include shopping habits, criminal records, medical history, credit records, etc. On one hand, such data are an important asset to business organizations and governments, which analyze them for decision making. On the other hand, privacy regulations and o...
Limitations of the SOM and the GTM
Datamining is becoming more and more popular thanks to the rapid development of computers and the need to extract information out of increasingly large data collections. Within datamining, one interesting field is visualizing the data to obtain a better understanding. One common approach is clustering with topology preservation, which can be achieved with the very popular SOM algorithm. Very si...
Analysis of Extended Performance for clustering of Satellite Images Using Bigdata Platform Spark
Due to the recent emergence Clustering techniques have been widely adopted in many real world data analysis applications, such as customer behavior analysis, targeted marketing, digital forensics, etc. As the satellite imagery is getting generated at a higher rate than the previous decades, it becomes essential to have better solutions in terms of accuracy as well as performance. In this paper,...
Spatial Analysis of COVID-19 and Exploration of Its Environmental and Socio-Demographic Risk Factors Using Spatial Statistical Methods: A Case Study of Iran
Background: Iran detected its first COVID-19 case in February 2020 in Qom province, which rapidly spread to other cities in the country. Iran, as one of those countries with the highest number of infected people, has officially reported 1812 deaths from a total number of 23049 confirmed infected cases that we used in the analysis. Materials and Methods: Geographic distribution by the map of ca...